- Path: mail2news.demon.co.uk!genesis.demon.co.uk
- From: Lawrence Kirby <fred@genesis.demon.co.uk>
- Newsgroups: comp.lang.c
- Subject: Re: C syntax question
- Date: Thu, 11 Apr 96 23:07:56 GMT
- Organization: none
- Message-ID: <829264076snz@genesis.demon.co.uk>
- References: <4ki00k$a4@mailhub.scitec.com.au>
- Reply-To: fred@genesis.demon.co.uk
- X-NNTP-Posting-Host: genesis.demon.co.uk
- X-Newsreader: Demon Internet Simple News v1.27
- X-Mail2News-Path: genesis.demon.co.uk
-
- In article <4ki00k$a4@mailhub.scitec.com.au>
- johns@rd.scitec.com.au "John Saunders" writes:
-
- >I have a question on C syntax that doesn't seem to be covered by the books
- >that I have. Maybe somebody with the full ANSI spec. could enlighten me.
- >
- >Consider the following code fragment:
- >
- > typedef unsigned char byte;
- > typedef int data;
- > typedef struct
- > {
- > byte byte;
- > struct
- > {
- > int dummy;
- > } data;
- > } struct_t;
- >
- >This is supposedly correct C, typedef names are ignored when defining
- >structure members. But not when defining variables so:
- > byte byte;
- >is an error when not inside a structure definition. Am I correct so far?
-
- Yes, every structure and union type defines a distinct namespace for its
- members.
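A minimal sketch of the point about member namespaces, reusing the typedefs from the question. The struct tag `record`, the extra member `count`, and the function `member_namespace_demo` are names made up for this illustration.

```c
#include <assert.h>

typedef unsigned char byte;   /* lives in the ordinary identifier namespace */
typedef int data;             /* likewise */

struct record {
    byte byte;                /* member named 'byte', of type byte */
    struct {
        int dummy;
    } data;                   /* member named 'data'; the typedef is untouched */
    data count;               /* 'data' still names the typedef here */
};

/* Stores a value through each member and returns their sum. */
int member_namespace_demo(void)
{
    struct record r;
    byte b = 7;               /* outside the struct, 'byte' is still the typedef */

    assert(sizeof r.byte == 1);   /* the member really has type unsigned char */

    r.byte = b;
    r.data.dummy = 30;
    r.count = 5;
    return r.byte + r.data.dummy + r.count;
}
```

The member names never collide with the typedef names because they are only looked up after `.` or `->`.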
-
- >My goal is to write a parser that handles the above correctly, all example
- >ANSI C parsers I have seen don't.
-
- The C language syntax is defined in terms of a context-free grammar. Therefore
- most if not all ANSI C parsers are based on that context-free grammar.
- A context-free grammar is incapable of doing what you want. In addition
- to the basic syntax the standard defines a number of 'constraints' such that
- if any of them is violated (or a syntax error occurs) then the compiler
- must issue a diagnostic.
-
- >I started playing around with mixing
- >typedefs and type keywords in the same declaration to see how C compilers
- >react. This is where it got strange. Consider the code.
-
- Mixing a typedef name with any other type specifier violates a constraint
- so:
-
- > typedef unsigned char ubyte;
- > typedef signed char byte;
- >
- > ubyte signed i;
- > byte unsigned j;
- > signed ubyte k;
- > unsigned byte l;
-
- all 4 of these violate the standard and an ANSI compiler must generate a
- diagnostic.
-
- >The compilers I tried allowed the declaration of i and j (some gave warnings
- >and others didn't).
-
- The warnings are fine. The compilers that compiled it silently are not
- conforming (at least with the compiler options you used).
-
- >However none liked the declaration of k and l. From this
- >it would seem that a typedef name is allowed only as the first token in the
- >declaration.
-
- Not true. However it must be the only type-specifier. It can be preceded by
- a storage-class-specifier (e.g. typedef, static, register etc.) and/or 1 or 2
- qualifiers (const, volatile).
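To illustrate that a typedef name need not be the first token, here is a small sketch of declarations that should be accepted, since each one has the typedef name as its only type specifier. The variable names and the function `typedef_position_demo` are invented for the example.

```c
#include <assert.h>

typedef unsigned char ubyte;
typedef signed char byte;

static ubyte counter = 200;        /* storage class, then typedef name */
static const byte offset = -5;     /* storage class and qualifier first */
const volatile ubyte flag = 1;     /* two qualifiers first */
byte const step = 3;               /* a qualifier may also follow the name */

/* Adds the objects declared above to one register-class local. */
int typedef_position_demo(void)
{
    register ubyte local = 4;      /* storage class inside a function */

    assert(sizeof counter == sizeof(unsigned char));
    return counter + offset + flag + step + local;
}
```

None of these adds a second type specifier, which is what separates them from `ubyte signed i;`.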
-
- >All the example ANSI C grammars that I have seen allow any number
- >of typedef names or type keywords in any order.
-
- The grammars do that because they can't do any better; that is left to the
- written constraints. In this case in section 6.5.2:
-
- "Each list of type specifiers shall be one of the following sets (delimited
- by commas, when there is more than one set in a line); the type specifiers
- may occur in any order, possibly intermixed with the other declaration
- specifiers.
-
- - void
-
- - char
-
- - signed char
-
- - unsigned char
-
- - short, signed short, short int, or signed short int
-
- - unsigned short, or unsigned short int
-
- - int, signed, signed int, or no type specifiers
-
- - unsigned, or unsigned int
-
- ...
-
- - enum-specifier
-
- - typedef-name"
-
- So typedef-name is a single set in this context and may not appear with
- any of the others in the same type specifier list, including signed and
- unsigned.
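A check like this would normally run after parsing, once the type specifiers of a declaration have been collected. The following is only a sketch under that assumption: `spec_kind` and `typedef_name_stands_alone` are hypothetical names, and only the typedef-name rule is checked, not the other sets in the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of type specifiers; not from any real parser. */
enum spec_kind { SPEC_KEYWORD, SPEC_TYPEDEF_NAME };

/*
 * Returns 1 if the list obeys the rule that a typedef-name must be the
 * only type specifier, 0 otherwise. Storage classes and qualifiers are
 * assumed to have been filtered out by the caller. Combinations of
 * keywords (e.g. "signed void") are deliberately not examined here.
 */
int typedef_name_stands_alone(const enum spec_kind *specs, size_t n)
{
    size_t i;

    assert(specs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (specs[i] == SPEC_TYPEDEF_NAME && n != 1)
            return 0;   /* constraint violated: diagnostic required */
    }
    return 1;
}
```

With this check, `ubyte signed i;` and `signed ubyte k;` are rejected alike, because the order of the specifiers does not matter.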
-
- >This is causing a major
- >problem in being able to parse "byte byte;" correctly in the structure
- >definition. I.e. is the second occurrence of "byte" supposed to be treated as
- >a typedef name or the member name?
-
- As far as the syntax analysis is concerned it is simply an identifier.
- After the syntax analysis the compiler can determine that it is a structure
- member name from its position in the syntax tree.
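One way a parser might act on this, offered as an assumption and not as anything the standard prescribes, is to treat an identifier as a typedef name only while no type specifier has yet been seen in the current declaration. That follows from the 6.5.2 rule: once a type specifier is present, a later identifier cannot also be a typedef name. The table `known_typedefs` and the function names below are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum ident_role { ROLE_TYPEDEF_NAME, ROLE_DECLARED_NAME };

/* Hypothetical typedef table; a real parser would consult its symbol table. */
static const char *const known_typedefs[] = { "byte", "data" };

static int is_known_typedef(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof known_typedefs / sizeof known_typedefs[0]; i++) {
        if (strcmp(known_typedefs[i], name) == 0)
            return 1;
    }
    return 0;
}

/*
 * Decides how to read an identifier met while parsing a declaration.
 * type_specifier_seen is nonzero if the current declaration already has
 * a type specifier (keyword, struct, enum, or typedef name).
 */
enum ident_role classify_identifier(const char *name, int type_specifier_seen)
{
    assert(name != NULL);
    if (!type_specifier_seen && is_known_typedef(name))
        return ROLE_TYPEDEF_NAME;
    return ROLE_DECLARED_NAME;   /* plain identifier: the name being declared */
}
```

For `byte byte;` the first `byte` is classified as a typedef name and the second as the name being declared. Whether that name is a struct member or an ordinary variable is then decided from where the declaration sits.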
-
- >Has anyone tackled this problem before?
-
- I don't believe there is a problem to tackle (beyond some of your compilers
- apparently being non-conforming). You're trying to make syntax analysis do
- something it is incapable of. The standard requires many sanity checks beyond
- those inherent in the syntax definition.
-
- --
- -----------------------------------------
- Lawrence Kirby | fred@genesis.demon.co.uk
- Wilts, England | 70734.126@compuserve.com
- -----------------------------------------
-